Conversation

@Stypox (Member) commented Oct 3, 2025

This PR updates nanojson to the commit from TeamNewPipe/nanojson#27.

I included FireMasterK@165b459 as suggested in TeamNewPipe/nanojson#27 (comment) and had to make some further changes to make all tests pass (in particular YoutubeStreamExtractor::getTags() was failing).

Just to be sure, I went through all raw usages of LinkedHashMap::get() and similar methods (LinkedHashMap being JsonObject's base class) to make sure we weren't depending on the types returned by direct raw access to the JSON object. I couldn't find other problematic places (except for timeago-generator), but I still replaced some raw accesses here and there.
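
To illustrate the kind of issue (a minimal sketch; the JSON literal and variable names are made up):

// Raw access via the LinkedHashMap base class may now return nanojson-internal
// types such as LazyString; the specialized accessors return plain values.
JsonObject obj = JsonParser.object().from("{\"title\":\"hello\"}"); // throws JsonParserException
Object raw = obj.get("title");          // possibly a LazyString, not a String
String title = obj.getString("title");  // always a plain String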

@FireMasterK you might be interested in including my changes in your version of the extractor :-)

  • I carefully read the contribution guidelines and agree to them.
  • I have tested the API against NewPipe.
  • I agree to create a pull request for NewPipe as soon as possible to make it compatible with the changed API.

@Stypox force-pushed the nanojson2 branch 2 times, most recently from b4d6e15 to 144520a on October 3, 2025 at 14:46
@Stypox (Member, Author) commented Oct 3, 2025

After making sure nanojson produces JDK 11-compatible binaries, the CI here builds (and all tests pass).

Stypox and others added 3 commits on October 4, 2025 at 14:18
  • get() may return objects with types that are used internally in the nanojson library, such as LazyString or Number. Use specialized methods instead.
  • This commit removes all such raw usages (except in SoundcloudParsingHelper::resolveIdWithWidgetApi()).
  • By doing so, timeago-generator now compiles again, and YoutubeChannelExtractor::getTags() and the JsonUtils tests pass again. Compilation and tests had failed because nanojson now internally uses LazyString instead of plain String.
@TobiGr (Contributor) commented Oct 4, 2025

LGTM, but the mock tests ran slower with these changes included (ca. 300 ms additional runtime on average over 10 runs). However, I do not know how the JUnit tests are run and how they differ from the extractor usage on mobile devices.

@Stypox (Member, Author) commented Oct 6, 2025

I collected 10 samples from before and after, and calculated the average and standard deviation. Given that data, we can't conclude which of the two commits is faster, since the two distributions have basically the same mean and a high stddev. I don't think the time taken to run tests is a good measure of performance anyway; I'm pretty sure Gradle introduces a lot of overhead to report test results.

Details (test runtimes in seconds)

1f6cb35 (after):
4.099
3.877
3.762
3.801
3.691
3.971
3.686
3.733
3.673
3.772
avg = 3.8065
std = 0.138

837705a (before):
4.191
3.918
4.009
3.757
3.697
3.708
3.719
3.670
3.770
3.687
avg = 3.8126
std = 0.172
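
These are sample means and sample (n−1) standard deviations; for reference, a quick sketch to reproduce them:

double[] after = {4.099, 3.877, 3.762, 3.801, 3.691, 3.971, 3.686, 3.733, 3.673, 3.772};
double avg = java.util.Arrays.stream(after).average().orElse(0);   // 3.8065
double variance = java.util.Arrays.stream(after)
        .map(x -> (x - avg) * (x - avg))
        .sum() / (after.length - 1);
double std = Math.sqrt(variance);                                  // ≈ 0.138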

@Stypox (Member, Author) commented Oct 6, 2025

Wait, I wrote a simple performance test that just tries to parse all mock JSON files. It takes 1.8 s on 837705a (before) and 26.5 s on 1f6cb35 (i.e. this PR). This is very strange...

import com.grack.nanojson.JsonParser;
import java.io.FileInputStream;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.stream.Collectors;
import static org.junit.jupiter.api.Assertions.assertEquals;

Path resourcePath = Paths.get("src", "test", "resources");
var jsonFiles = new ArrayList<>();
for (Path p : Files.walk(resourcePath).collect(Collectors.toList())) {
    if (!Files.isRegularFile(p) || !p.toString().endsWith(".json")) {
        continue;
    }
    try (var in = new FileInputStream(p.toFile())) { // close each file after parsing
        jsonFiles.add(JsonParser.any().from(in));
    }
}
assertEquals(808, jsonFiles.size());

@FireMasterK (Member):

The major efficiency gains come in two places: lazy strings, and buffer reuse when the application runs for a long time.

String deserialization takes time because of the UTF validation; a LazyString defers that cost, spending CPU time on decoding a string only when it is actually read. That is quite useful for us, since we don't read every String in every JSON object. I borrowed this idea from the Jackson library.
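
To illustrate the idea behind lazy strings (a minimal sketch, not nanojson's actual implementation; all names here are made up):

// Keep the raw bytes and only pay the UTF-8 validation/decoding cost
// the first time the value is actually read.
final class LazyStringSketch implements CharSequence {
    private final byte[] raw;  // undecoded bytes straight from the parser buffer
    private String decoded;    // cached result of the first decode

    LazyStringSketch(byte[] raw) { this.raw = raw; }

    private String decode() {
        if (decoded == null) {
            decoded = new String(raw, java.nio.charset.StandardCharsets.UTF_8);
        }
        return decoded;
    }

    @Override public int length() { return decode().length(); }
    @Override public char charAt(int i) { return decode().charAt(i); }
    @Override public CharSequence subSequence(int a, int b) { return decode().subSequence(a, b); }
    @Override public String toString() { return decode(); }
}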

The buffer pool is useful when the application is long-running: it reduces GC thrashing and the overhead of creating a fresh buffer for every JSON parse. The benefits are more visible when a lot of JSON is parsed or the application runs for a long time, so that buffers are actually reused instead of recreated - mainly server use-cases like Piped rather than the NewPipe application.

The best way to benchmark would be to use a micro-benchmarking tool like JMH.

@FireMasterK (Member):

> Wait, I wrote a simple performance test that just tries to parse all mock JSON files. It takes 1.8 s on 837705a (before) and 26.5 s on 1f6cb35 (i.e. this PR). This is very strange...
> […]

That's very odd; I will take a look with a profiler later to see what's causing the issue.

@Stypox (Member, Author) commented Oct 6, 2025

@FireMasterK I already discovered why: that set of JSON files contains very long strings, and the reusableBuffer's size was growing linearly instead of exponentially, so every expansion copies the whole buffer and parsing a long string becomes quadratic.

@FireMasterK (Member):

> @FireMasterK I already discovered why: that set of JSON files contains very long strings, and the reusableBuffer's size was growing linearly instead of exponentially […]

Ah, that would make sense! The current logic grows the buffer size in increments of 512 as needed. Could you check with FireMasterK/nanojson@aae1500, where I double the buffer size instead of growing it more conservatively?

@Stypox (Member, Author) commented Oct 6, 2025

@FireMasterK yes, that works. I actually used the diff below, but it does the same thing; I also found that a growth factor of 1.3 works fine, which might save some memory in extreme cases.

Diff
commit e890a9e85df905aeb8cff91f43c612c643d37127
Author: Stypox
Date:   Mon Oct 6 21:57:52 2025 +0200

    Ensure exponential growth of container size
    
    This makes it so that even for long strings the parse time is O(n) amortized instead of O(n²)

diff --git a/src/main/java/com/grack/nanojson/JsonTokener.java b/src/main/java/com/grack/nanojson/JsonTokener.java
index 3fa5807..5e1455a 100644
--- a/src/main/java/com/grack/nanojson/JsonTokener.java
+++ b/src/main/java/com/grack/nanojson/JsonTokener.java
@@ -762,7 +762,16 @@ final class JsonTokener implements Closeable {
 	private void expandBufferIfNeeded(int size) {
 		if (reusableBuffer.remaining() < size) {
 			int oldPos = reusableBuffer.position();
-			int increment = Math.max(512, size - reusableBuffer.remaining());
+			int increment = Math.max(
+					// don't reallocate too small parts
+					512,
+					Math.max(
+							// allocate at least as much as needed
+							size - reusableBuffer.remaining(),
+							// ensure exponential growth so that it's O(n) amortized
+							(int) (reusableBuffer.capacity() * 0.3)
+					)
+			);
 			CharBuffer newBuffer = CharBuffer.allocate(reusableBuffer.capacity() + increment);
 			reusableBuffer.flip(); // position -> 0, limit -> oldPos
 			newBuffer.put(reusableBuffer); // copy all existing data

Anyway, I tried to benchmark the old version and the new version using the data stored in the mocks (which is the kind of data we care about). Unfortunately I didn't get any significant performance improvement; it's basically the same. I didn't use JMH though, as it seems a bit complicated to set up. If you have it at hand and think it would give clearer results, I'd appreciate it if you could take my code and run it in JMH.

Code
Path resourcePath = Paths.get("src", "test", "resources");
var files = Files.walk(resourcePath).collect(Collectors.toList());
var jsonSamples = new ArrayList<String>();
for (Path p : files) {
    if (!Files.isRegularFile(p) || !p.toString().endsWith(".json")) {
        continue;
    }
    // Gson (com.google.gson.JsonParser) is used here only to extract the JSON
    // response bodies from the mock files; nanojson is what gets benchmarked below.
    try (var reader = new FileReader(p.toFile())) {
        var mockRequest = JsonParser.parseReader(reader)
                .getAsJsonObject()
                .getAsJsonObject("response");
        if (mockRequest.getAsJsonObject("responseHeaders")
                .getAsJsonArray("content-type")
                .get(0)
                .getAsString()
                .contains("json")) {
            jsonSamples.add(mockRequest.get("responseBody").getAsString());
        }
    } catch (ClassCastException | IllegalArgumentException | NullPointerException
            | IllegalStateException ignored) {
        // skip mocks that don't have the expected structure
    }
}

assertEquals(525, jsonSamples.size());

var totaltime = Duration.ZERO;
int N = 20;
Duration first = null;
// N + 1 iterations: the first is treated as a cold start and excluded from the average
for (int i = 0; i < N + 1; ++i) {
    System.gc();
    var t0 = Instant.now();
    for (String content : jsonSamples) {
        com.grack.nanojson.JsonParser.any().from(content);
    }
    var t1 = Instant.now();
    var spent = Duration.between(t0, t1);
    System.out.println("This one took " + spent);
    if (first == null) {
        first = spent;
    } else {
        totaltime = totaltime.plus(spent);
    }
}
System.out.println("Average " + totaltime.dividedBy(N) + "    cold start " + first);
Results

All CPU cores fixed to 2.434 GHz, 8 GB of RAM given to Java; I did 5 manual runs of the same benchmark.

TeamNewPipe/nanojson@e9d656d (before + changes to upstream):
Average PT0.424434979S cold start PT0.600228876S
Average PT0.421854358S cold start PT0.582881900S
Average PT0.433531221S cold start PT0.644534680S
Average PT0.409151567S cold start PT0.582754237S
Average PT0.413803543S cold start PT0.621497063S

TeamNewPipe/nanojson@bc71b09 (after + exponential buffer growth):
Average PT0.419199352S cold start PT0.603883384S
Average PT0.429111054S cold start PT0.557645143S
Average PT0.425892633S cold start PT0.584124546S
Average PT0.437804042S cold start PT0.563461599S
Average PT0.425330307S cold start PT0.572033185S

TeamNewPipe/nanojson@df185fe (after, without the Java 11 updates):
Average PT0.418180127S cold start PT0.563748461S
Average PT0.435960600S cold start PT0.582034896S
Average PT0.420477865S cold start PT0.581220988S
Average PT0.429573602S cold start PT0.571339387S
Average PT0.436934119S cold start PT0.575480546S

@FireMasterK (Member) commented Oct 8, 2025

My fork:
# Warmup Iteration   1: 0.276 s/op
# Warmup Iteration   2: 0.259 s/op
Iteration   1: 0.258 s/op
Iteration   2: 0.258 s/op
Iteration   3: 0.258 s/op
Iteration   4: 0.257 s/op
Iteration   5: 0.257 s/op
Iteration   6: 0.257 s/op
Iteration   7: 0.257 s/op
Iteration   8: 0.258 s/op
Iteration   9: 0.257 s/op
Iteration  10: 0.256 s/op


Result "benchmarks.JsonParsingBenchmark.parseAll":
  0.257 ±(99.9%) 0.001 s/op [Average]
  (min, avg, max) = (0.256, 0.257, 0.258), stdev = 0.001
  CI (99.9%): [0.256, 0.258] (assumes normal distribution)
NewPipe nanojson fork:
# Warmup Iteration   1: 0.276 s/op
# Warmup Iteration   2: 0.259 s/op
Iteration   1: 0.258 s/op
Iteration   2: 0.258 s/op
Iteration   3: 0.258 s/op
Iteration   4: 0.257 s/op
Iteration   5: 0.257 s/op
Iteration   6: 0.257 s/op
Iteration   7: 0.257 s/op
Iteration   8: 0.258 s/op
Iteration   9: 0.257 s/op
Iteration  10: 0.256 s/op


Result "benchmarks.JsonParsingBenchmark.parseAll":
  0.257 ±(99.9%) 0.001 s/op [Average]
  (min, avg, max) = (0.256, 0.257, 0.258), stdev = 0.001
  CI (99.9%): [0.256, 0.258] (assumes normal distribution)
Pre-update nanojson fork (latest):
# Warmup Iteration   1: 0.257 s/op
# Warmup Iteration   2: 0.243 s/op
Iteration   1: 0.245 s/op
Iteration   2: 0.245 s/op
Iteration   3: 0.242 s/op
Iteration   4: 0.242 s/op
Iteration   5: 0.240 s/op
Iteration   6: 0.243 s/op
Iteration   7: 0.242 s/op
Iteration   8: 0.242 s/op
Iteration   9: 0.244 s/op
Iteration  10: 0.242 s/op


Result "benchmarks.JsonParsingBenchmark.parseAll":
  0.243 ±(99.9%) 0.003 s/op [Average]
  (min, avg, max) = (0.240, 0.243, 0.245), stdev = 0.002
  CI (99.9%): [0.240, 0.245] (assumes normal distribution)

You are right; I'm not sure what's going on. I will need to take a deeper look.
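
For reference, a minimal JMH harness of this shape (the class and method names match the output above; everything else, including the placeholder sample loading in @Setup, is assumed) looks roughly like:

package benchmarks;

import java.util.List;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;
import com.grack.nanojson.JsonParser;
import com.grack.nanojson.JsonParserException;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.SECONDS)
public class JsonParsingBenchmark {
    private List<String> jsonSamples;

    @Setup
    public void setup() {
        // placeholder: in practice, load the mock response bodies
        // as in the snippet earlier in this conversation
        jsonSamples = List.of("{\"a\": 1}", "[1, 2, 3]");
    }

    @Benchmark
    public void parseAll(Blackhole bh) throws JsonParserException {
        for (String content : jsonSamples) {
            // Blackhole prevents the JIT from eliminating the parse as dead code
            bh.consume(JsonParser.any().from(content));
        }
    }
}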
